Chunk Incremental LDA Computing on Data Streams
Abstract
This paper presents a constructive method for deriving an updated discriminant eigenspace for classification when bursts of new classes of data are added to an initial discriminant eigenspace in the form of random chunks. The proposed chunk incremental linear discriminant analysis (I-LDA) can effectively evolve a discriminant eigenspace over a fast and large data stream and, compared with other methods, extracts features with superior discriminability for classification.
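The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea of chunk-wise LDA updating, not the paper's I-LDA algorithm. It assumes that per-class counts, means, and scatter matrices are kept as running statistics and merged with each incoming chunk, including chunks that introduce unseen classes. The class name `ChunkLDAStats` and its methods are hypothetical names chosen for illustration.

```python
import numpy as np


class ChunkLDAStats:
    """Running LDA statistics updated one chunk at a time (illustrative only)."""

    def __init__(self, dim):
        self.dim = dim
        self.counts = {}    # class label -> number of samples seen so far
        self.means = {}     # class label -> running class mean
        self.scatters = {}  # class label -> scatter around the class mean

    def update(self, X, y):
        """Merge one chunk (X: samples x dim, y: labels) into the statistics."""
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        for label in np.unique(y):
            Xc = X[y == label]
            n_new = Xc.shape[0]
            m_new = Xc.mean(axis=0)
            centered = Xc - m_new
            S_new = centered.T @ centered
            key = label.item()
            if key not in self.counts:
                # A class that has not been seen before simply gets a new slot.
                self.counts[key] = n_new
                self.means[key] = m_new
                self.scatters[key] = S_new
            else:
                # Pooled update of mean and scatter for an existing class.
                n_old = self.counts[key]
                n_tot = n_old + n_new
                diff = m_new - self.means[key]
                self.scatters[key] = (
                    self.scatters[key]
                    + S_new
                    + (n_old * n_new / n_tot) * np.outer(diff, diff)
                )
                self.means[key] = self.means[key] + (n_new / n_tot) * diff
                self.counts[key] = n_tot

    def scatter_matrices(self):
        """Return (within-class, between-class) scatter from the running stats."""
        n_total = sum(self.counts.values())
        global_mean = np.zeros(self.dim)
        for key, n in self.counts.items():
            global_mean += n * self.means[key]
        global_mean /= n_total
        S_w = np.zeros((self.dim, self.dim))
        S_b = np.zeros((self.dim, self.dim))
        for key, n in self.counts.items():
            S_w += self.scatters[key]
            d = self.means[key] - global_mean
            S_b += n * np.outer(d, d)
        return S_w, S_b

    def discriminant_basis(self, n_components):
        """Leading eigenvectors of pinv(S_w) @ S_b as columns."""
        S_w, S_b = self.scatter_matrices()
        eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
        order = np.argsort(-eigvals.real)
        return eigvecs[:, order[:n_components]].real
```

In this sketch the eigenspace is recomputed from the merged scatter matrices on request; a method that updates the eigenspace itself without a full decomposition would need additional machinery not shown here.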
Related resources
Dynamic Weighted Majority for Incremental Learning of Imbalanced Data Streams with Concept Drift
Concept drifts occurring in data streams will jeopardize the accuracy and stability of the online learning process. If the data stream is imbalanced, it will be even more challenging to detect and cure the concept drift. In the literature, these two problems have been intensively addressed separately, but have yet to be well studied when they occur together. In this paper, we propose a chunk-ba...
Incremental LDA Learning by Combining Reconstructive and Discriminative Approaches
Incremental subspace methods have proven to enable efficient training if large amounts of training data have to be processed or if not all data is available in advance. In this paper we focus on incremental LDA learning which provides good classification results while it assures a compact data representation. In contrast to existing incremental LDA methods we additionally consider reconstructiv...
A Fine-Grained, Dynamic Load Distribution Model for Parallel Stream Processing
Our goal is to address the unique characteristics and limitations of emerging large-scale commodity clusters to leverage their potential for the parallel processing of multidimensional data streams. To this end, we describe a new distributed stream processing model that integrates data and task parallelism by partitioning workloads into self-describing chunks that are dynamically assigned to ava...
A Multi-partition Multi-chunk Ensemble Technique to Classify Concept-Drifting Data Streams
We propose a data mining technique based on a multi-partition, multi-chunk ensemble classifier to classify concept-drifting data streams. Existing ensemble techniques for classifying concept-drifting data streams follow a single-partition, single-chunk approach, in which a single data chunk is used to train one classifier. In our approach, we train a collection of v classifiers from r consecutive dat...
Robust Textual Data Streams Mining Based on Continuous Transfer Learning
In a textual data stream environment, concept drift can occur at any time. Existing approaches that partition streams into chunks can run into problems if the chunk boundary does not coincide with the change point, which is impossible to predict. Since concept drift can occur at any point in the stream, it will certainly occur within chunks; this is called random concept drift. The paper proposed an ap...